ABSTRACT 

Gene translocations play an important role in the 
plasticity and evolution of bacterial genomes. In 
this study, we investigated the impact on gene regu- 
lation of three genome organizational features that 
can be altered by translocations: (i) chromosome 
position; (ii) gene orientation; and (iii) the distance 
between a target gene and its transcription factor 
gene ('target-TF distance'). Specifically, we 
quantified the effect of these features on constitu- 
tive expression, transcription factor binding and/or 
gene expression noise using a synthetic network in 
Escherichia coli composed of a transcription factor 
(LacI repressor) and its target gene (yfp). Here we 
show that gene regulation is generally robust to 
changes in chromosome position, gene orientation 
and target-TF distance. The only demonstrable 
effect was that chromosome position alters consti- 
tutive expression, due to changes in gene copy 
number and local sequence effects, and that this 
determines maximum and minimum expression 
levels. The results were incorporated into a math- 
ematical model which was used to quantitatively 
predict the responses of a simple gene network to 
gene translocations; the predictions were confirmed 
experimentally. In summary, gene translocation can 
modulate constitutive gene expression levels due to 
changes in chromosome position but it has minimal 
impact on other facets of gene regulation. 

INTRODUCTION 

Gene translocations can occur by several common mech- 
anisms including intra-chromosomal recombination, 
transposition and inversions (1,2) resulting in the rapid 
generation of novel phenotypes without changing the 
composition of the genome. The effects of translocations 
on gene regulation are not well understood in bacteria and 
it must be stressed that findings in eukaryotes may not be 
applicable given the fundamental differences in their 
genome organization and their mechanisms of transcrip- 
tion and translation. This study focuses on three key 
features of genome organization that may be altered by 
gene translocation: (i) chromosome position; (ii) gene 
orientation; and (iii) the distance between a target gene 
and its transcription factor gene ('target-TF distance'). 

In bacteria, chromosome position is thought to alter 
gene expression in two ways. Firstly, gene expression de- 
creases with the distance of a gene from the origin of rep- 
lication (oriC) (3-5). The relationship occurs because 
bacterial chromosome replication is initiated at oriC and 
proceeds bidirectionally until reaching the terminus 
region. As a consequence, genes located near oriC have 
more copies, particularly at high rates of growth when 
multiple rounds of chromosome replication are initiated 
(6). Secondly, gene expression appears to increase and 
decrease periodically along the chromosome due to 
DNA compaction and supercoiling (7-10). Eukaryotic 
studies suggest that chromosome position may possibly 
alter other aspects of gene regulation including silencing 
(11,12), timing of expression (13,14), tissue distribution of 
expression (14,15), and the burst size of transcription 
events and stochastic fluctuations in gene expression 
('gene expression noise') (16-18). Whether these effects 
also occur in bacteria is unclear given their very different 
chromosome structures (19), cell division rates and the 
absence of histones in bacteria, which are largely respon- 
sible for chromosome position effects in eukaryotes 
(11,16). 

Gene orientation may also affect bacterial expression 
levels (20) and/or gene expression noise (19). This has 
been proposed based on the greater number of highly ex- 
pressed, essential genes on the leading strand (19) and the 
presence of specific mechanisms in the cell that terminate 
the transcription of genes on the lagging strand to prevent 
collisions between RNA polymerases and replisomes (21). 
The chromosomal distance between two interacting genes 
(e.g. a transcription factor and its target) is also thought to 
be functionally important because it determines the 
distance the transcription factor needs to diffuse and 
thus its efficacy (note: chromosomal distance does not ne- 
cessarily correspond to the physical distance due to DNA 
folding); it has been postulated that this may explain why 
many transcription factor genes and genes in the same 
pathway are located in close proximity to their target 
genes (22-25). 

To assess the effect on gene regulation of chromosome 
position, gene orientation and the target-TF distance we 
created a synthetic network consisting of a transcription 
factor (LacI) gene and its target gene (yfp regulated by 
PLlacO-1). This system allowed the measurement of 
gene expression in single cells and the tuning of transcrip- 
tion factor activity by adding varying concentrations of an 
inducer molecule (isopropyl-β-D-thiogalactopyranoside, 
IPTG) to the media. Therefore, we were able to quantify 
multiple features of gene regulation including constitutive 
expression, transcription factor binding and cooperativity, 
and gene expression noise. The synthetic gene network 
enabled the effects of each genome organizational 
feature to be evaluated independently and decoupled 
from physiological control mechanisms (26-29). In 
contrast, random translocation events that occur naturally 
or experimentally typically alter several genome organiza- 
tional features simultaneously and cause a multitude of 
other changes that include the disruption and formation 
of operons (30) and the alteration of cis-regulatory se- 
quences; this complexity would have made it very difficult 
to characterize the specific effects of each genome organ- 
izational feature. 

The study had four main parts. The first part 
characterized the effect of chromosome position and 
gene orientation on constitutive expression and transcrip- 
tion factor binding. We found that chromosome position 
only alters constitutive expression due to differences in 
gene copy number and that gene orientation had no 
effect. We also demonstrated that in the absence of 
flanking terminators, the maximum expression level of a 
gene is more likely to be altered by local sequences at 
different chromosome positions. The second part created 
a model and this was used to predict how gene regulation 
is modulated by translocation of the transcription factor 
gene; these predictions were confirmed experimentally. 
This demonstrated that gene translocations can be used 
to rationally reprogram the output of simple gene 
networks. The third part examined whether target-TF 
distance alters transcription factor activity, and we 
found that it does not. The fourth part investigated 
whether chromosome position affects stochastic fluctu- 
ations in gene expression ('gene expression noise') and 
we demonstrated that it does not. 



MATERIALS AND METHODS 

Plasmids and strains 

Details of the strains and plasmids, and the oligonucleo- 
tides used to construct them, are in Supplementary Tables 
S1-S3. Briefly, plasmids were constructed containing monomeric cfp or yfp (provided by R. Tsien, UC San Diego, CA, USA) under the control of the PLlacO-1 promoter, which is regulated by LacI (31). Translation of cfp and yfp was controlled by the T7 10 5' UTR sequence ('T7 10 RBS') obtained from the pET-11a plasmid (Stratagene). These plasmids were used as templates for PCR amplification and the PCR products were inserted into the genome using the lambda Red system (32). 

Five sets of strains were used in the study. The first set maintains lacI at the native position and varies the chromosome position of the target gene (PLlacO-1::T7 10 RBS::yfp) on the leading strand (Figure 1B): yjbI' (HL5135), galK (HL5058), jayE (HL5133), arpB' (HL5033), intS (HL5043), yfjV (HL5141), yhdW (HL5036) and glvC (HL5137). The second set also maintains lacI at the native position and varies the chromosome position of the target gene, but the target gene is on the lagging strand (Figure 1C): ilvG' (HL5116), yjbI (HL5042), yjiP (HL5037), ykfC (HL5038), galK (HL5131), jayE (HL5059), arpB' (HL5136), intS (HL5132), yfjV (HL5035), yhdW (HL5134) and glvC (HL5034). The third set maintains lacI at the native position and varies the chromosome position of the target gene (on the leading and lagging strands) but there are no terminators upstream or downstream of the target gene (Figure 3). In the fourth set we varied the chromosome position of lacI (with its native cis-regulatory sequences) and kept constant the position of PLlacO-1::T7 10 RBS::cfp at intS and PLlacO-1::T7 10 RBS::yfp at galK (Figure 4). lacI was also inserted into the chromosome using lambda Red recombinase (32). In the fifth set, we inserted lacI directly upstream of PLlacO-1::T7 10 RBS::cfp at intS or PLlacO-1::T7 10 RBS::yfp at galK (Figure 5). 

Measurements of gene expression, mRNA concentrations 
and gene copy number 

Measurements of gene expression were performed by 
fluorescence microscopy as recently described (33) except 
that all cells were included in the analysis. Cells were 
grown in LB media unless otherwise stated. mRNA con- 
centrations were measured in duplicate independent 
cultures on independent blots using previously described 
protocols and probes (33). Gene copy number was 
measured by quantitative PCR using oligonucleotides to 
amplify the 5' end of the yfp and rrsB (control) genes and 
the relative amounts of DNA were calculated as previ- 
ously described (33). PCR efficiencies were determined 
by serial dilution of template DNA. The standard error 
of the mean and error propagation were calculated by 
standard methods (34). 

Fits and parameter estimation 

For the model that predicts the effect of lacI translocations [Equations (3)-(5)] we calculated φ to be 7.44 ± 0.68 by subtracting one from the average dynamic range (see Figure 3D). We attempted to obtain h by fitting Equation (2) to the data shown in Figure 1B but the error in the parameters was unacceptably high (Supplementary Table S4). Therefore we used n = 1.51 ± 0.03, which was the weighted value obtained from our measurements (Supplementary Table S5), as an estimate for h. This value of h was close to the Hill number for the repressor-operator interaction at the lacZYA promoter without DNA looping (n = 1.45) (35). Since the yfp reporter is at the galK position, β was set to the value for the maximum expression at this location (936 ± 121 a.u.; Supplementary Table S5). β_lacI, which is the maximum level of LacI expression at each position, was estimated from the amount of YFP expression at the same location. The induction curves with lacI at yfjV were similar to those with lacI at the native position, indicating that LacI production at this location is likely to be similar, and therefore it was used as an estimate for β_0 (225 ± 28 a.u.; Supplementary Table S5). Fits were performed using the Levenberg-Marquardt non-linear least squares algorithm in Origin (version 7.5, OriginLab). 

Figure 1. Gene expression at different chromosome positions and gene orientations. Error bars indicate s.e.m. of duplicate measurements. Color of the data symbols indicates positions in panel A. Strain numbers are given in the Materials and Methods section. (A) Diagram showing the gene circuit used in these experiments and the different chromosome positions of yfp, the origin and termination of replication, and lacI. (B, C) YFP expression on the leading and lagging strands as a function of the shortest distance of yfp from the origin (oriC). Data fitted to Equation (1). (D) Gene copy number as a function of chromosome position. (E) Induction curves for yfp at six positions with four on the leading strand (galK, arpB', intS and yhdW) and two on the lagging strand (yjbI and glvC). Data fitted to equation shown where α is the maximum induced amount of expression, δ is the minimum expression, n is the Hill coefficient and K_x is the IPTG concentration at half-maximal induction. The observed maximum expression = α + δ = constitutive expression. (F) Maximum YFP expression as a function of minimum YFP expression. Data from panel E. Maximum and minimum expression were obtained experimentally (i.e. not from fits) at 1 and 0 mM IPTG. 

Stochastic simulations 

Stochastic simulations were generated using Gillespie's algorithm programmed in Matlab (R2008a, MathWorks). The initial state (repressed or unrepressed) was randomly selected and the simulations were run until a steady-state protein concentration was achieved. Switching between the repressed and unrepressed states and transcription events in the unrepressed state were stochastic. We also included the stochastic generation of proteins from the mRNA and stochastic degradation events for the proteins and mRNAs. The rate constants for translation (k_P), mRNA degradation (k_-M) and protein degradation (k_D) had the same values for all simulations [k_P = 1 protein (mRNA·time)^-1, k_-M = 0.05 time^-1, k_D = 0.02 time^-1]. Other parameter values are in the figure legends. 
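
The simulation scheme described above can be sketched in a few lines. The following is a minimal Python illustration, not the authors' Matlab code. It assumes a two-state promoter; the switching rates `k_on` and `k_off` and the transcription rate `k_m` are hypothetical placeholders (the source gives these values only in the figure legends), whereas the translation and degradation constants default to the values stated in the text. As a simplification, the run stops at a fixed time `t_end` instead of testing for steady state.

```python
import random


def gillespie_two_state(k_on, k_off, k_m, k_p=1.0, k_deg_m=0.05, k_deg_p=0.02,
                        t_end=2000.0, seed=None):
    """Simulate a repressed/unrepressed promoter; return (mRNA, protein) at t_end.

    k_on, k_off and k_m are hypothetical illustration parameters.
    """
    rng = random.Random(seed)
    unrepressed = rng.random() < 0.5        # initial promoter state chosen at random
    mrna, protein, t = 0, 0, 0.0
    while True:
        # Propensities of the six possible events in the current state.
        rates = [
            0.0 if unrepressed else k_on,   # repressed -> unrepressed
            k_off if unrepressed else 0.0,  # unrepressed -> repressed
            k_m if unrepressed else 0.0,    # transcription (unrepressed state only)
            k_p * mrna,                     # translation
            k_deg_m * mrna,                 # mRNA degradation
            k_deg_p * protein,              # protein degradation
        ]
        total = sum(rates)
        if total == 0.0:
            break
        t += rng.expovariate(total)         # exponential waiting time to next event
        if t > t_end:
            break
        # Choose the event with probability proportional to its propensity.
        threshold = rng.random() * total
        cumulative = 0.0
        for event, rate in enumerate(rates):
            cumulative += rate
            if threshold < cumulative:
                break
        if event == 0:
            unrepressed = True
        elif event == 1:
            unrepressed = False
        elif event == 2:
            mrna += 1
        elif event == 3:
            protein += 1
        elif event == 4 and mrna > 0:
            mrna -= 1
        elif event == 5 and protein > 0:
            protein -= 1
    return mrna, protein
```

Repeating the run with different seeds and taking the coefficient of variation of the final protein counts gives one simple estimate of gene expression noise for a chosen parameter set.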

RESULTS 

Gene expression at different chromosome positions and 
gene orientations 

In the first set of experiments, an identical gene (PLlacO-1::T7 10 RBS::yfp) was placed at eight chromosome positions: intS (53.1'), galK (17.0'), jayE (26.0'), arpB' (38.8'), yfjV (59.7'), yhdW (73.7'), glvC (83.2') and yjbI (91.6') (Figure 1A). We chose the first two positions because they are well characterized reference sites (36) and the latter six positions because they harbor pseudogenes and therefore their replacement with yfp should have minimal physiological impact. yfp was inserted with flanking upstream and downstream terminators on the leading strand and its transcription was controlled by the PLlacO-1 promoter, which is repressed by LacI encoded by lacI at the native position (7.9'). In addition, 78 nucleotides of the T7 bacteriophage 10 gene were fused to the 5' end of yfp to enhance translational efficiency. yfp transcription was varied by adding different concentrations of inducer (isopropyl-β-D-thiogalactopyranoside, IPTG) to the media and its expression was quantified by measuring YFP fluorescence. 

Figure 2. Effect of chromosome position on gene expression depends on cell growth rates. Error bars indicate s.e.m. of duplicate measurements. Normalized gene expression as a function of the shortest distance from the target gene to the origin (oriC) in three different media. Chromosome positions are on the leading (galK, arpB', intS and yhdW) and lagging (yjbI, yjiP, ykfC, jayE, yfjV and glvC) strands (strain numbers in Materials and Methods section). Gene expression was normalized to the level measured at the position closest to the origin (glvC). The predicted function is determined by Equation (1). The function was completely determined by the values for C and the doubling times (τ), which were obtained independently of this plot (Supplementary Figure S1). Intercept is 1.0 because the data were normalized. 

We measured the maximum YFP expression (1 mM IPTG) at the different positions on the leading strand and plotted it as a function of p, the shortest distance from its position to the origin of replication (oriC at 84.6') (Figure 1B). We found that gene expression decreased with p. This finding is consistent with Cooper and Helmstetter's model (4,6,37,38), which predicts that gene copy number decreases with increasing p due to chromosome replication. Assuming that gene expression (E) is proportional to the gene copy number then 



E(p) = E_0 · ■    (1) 



where E_0 is the steady state expression of a gene at the origin (i.e. p = 0) in units of fluorescence, C is the average time to replicate the chromosome and τ is the average cell doubling time. In non-synchronized, exponentially growing cells at steady state C = 47 min (39) and the average doubling time in our experiments was 20.9 min (Supplementary Figure S1). Equation (1) was fitted to the data with only one free parameter (E_0, which has no effect on the slope) to generate a predicted relationship between the expression level and the distance of a gene from oriC, and it was found to be in good agreement with the data (Figure 1B). 
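
The gene dosage argument can be made concrete with a short calculation. The sketch below is illustrative only and rests on stated assumptions: it uses the textbook Cooper-Helmstetter dependence, in which copy number relative to a gene at oriC falls as 2^(-C·x/τ) with x the fraction of the origin-to-terminus distance, and it assumes a 100-minute circular map with the terminus half a map away from oriC. The function names and the fractional-distance convention are hypothetical and are not taken from the source.

```python
def fractional_distance(position_min, ori_min=84.6, map_length=100.0):
    """Shortest distance from oriC as a fraction of the origin-to-terminus arc.

    Assumes a circular map of `map_length` minutes with the terminus
    located half a map away from oriC (an approximation).
    """
    d = abs(position_min - ori_min) % map_length
    d = min(d, map_length - d)              # take the shorter way around the circle
    return d / (map_length / 2.0)


def relative_expression(frac, C=47.0, tau=20.9):
    """Expression relative to a gene at oriC, assuming E is proportional to copy number."""
    return 2.0 ** (-C * frac / tau)


# Example: galK lies at 17.0' on the map; C and tau (LB medium) are from the text.
x_galK = fractional_distance(17.0)
ratio_galK = relative_expression(x_galK)
```

Because only the ratio C/τ enters the exponent, a longer doubling time flattens the predicted decline with distance, which is why E_0 alone does not affect the slope of the fitted curve.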

yfp was next placed on the lagging strand at the same eight sites plus three additional sites (ykfC at 5.9', yjiP at 98.4' and ilvG' at 85.1'). Again, we found that expression decreased with the distance from oriC and the data agreed with the relationship predicted by Equation (1) (Figure 1C). The difference in the predicted intercept for the leading and lagging strands (5573 ± 354 a.u. and 5014 ± 151 a.u., respectively) was thought to be due to measurement error between the different experiments, and this was confirmed by measuring YFP expression in both orientations at four positions in the same experiment and showing there was no consistent difference in expression (Supplementary Figure S2). 

We directly tested the explanation that gene expression decreases with distance from oriC due to decreasing gene copy number and consequently decreasing mRNA concentrations. Gene copy number was measured by quantitative PCR (Materials and Methods section) and it was found to decrease with the distance from oriC as predicted (Figure 1D). 

To identify the specific aspects of gene regulation that depend on chromosome position we measured the induction of gene expression at six chromosome positions by varying the IPTG concentration (Figure 1E and Supplementary Table S6). The induction curve was fitted to a standard Hill-type function (40,41) to determine the Hill coefficient (n), which measures the steepness of the curve and provides a measure of LacI cooperativity, and to determine the IPTG concentration at half-maximal induction (K_x), which is a measure of LacI affinity for the DNA. We found that the Hill coefficient and the IPTG concentration at half-maximal induction did not alter with chromosome position (Supplementary Figure S3). In contrast, maximum and minimum expression levels altered with the position but their ratio ('dynamic range') was relatively constant, with maximum expression being 8.66-fold (± 0.84) the minimum expression (linear regression, P = 0.0005, R² = 0.96) (Figure 1F and Supplementary Figure S3). The ratio is constant because the level of LacI activity is constant (as measured by LacI cooperativity and the IPTG concentration required for half-maximal induction) and does not depend on the position of the target gene; minimum expression is therefore simply proportional to the amount of 'leaky' constitutive expression. 
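
To illustrate why a position-dependent change in copy number leaves the dynamic range untouched, the sketch below evaluates a Hill-type induction curve. The functional form (a baseline plus a saturating Hill term) is an assumption standing in for the fitted equation, which appears only in Figure 1E, and the parameter values used in the example are hypothetical, chosen solely to reproduce a ratio of 8.66.

```python
def hill_induction(iptg, alpha, delta, n, k_x):
    """Assumed Hill-type induction curve.

    alpha: maximum induced expression above baseline
    delta: minimum (uninduced) expression
    n:     Hill coefficient
    k_x:   IPTG concentration at half-maximal induction
    """
    return delta + alpha * iptg ** n / (k_x ** n + iptg ** n)


def dynamic_range(alpha, delta):
    """Ratio of maximum (alpha + delta) to minimum (delta) expression."""
    return (alpha + delta) / delta


# Hypothetical parameters: a ratio of 8.66 with an arbitrary baseline of 100 a.u.
alpha, delta, n, k_x = 766.0, 100.0, 1.5, 3e-5

# Doubling the gene copy number scales alpha and delta together,
# so maximum and minimum expression change but their ratio does not.
range_single = dynamic_range(alpha, delta)
range_double = dynamic_range(2 * alpha, 2 * delta)
```

In this picture, n and k_x describe the repressor and are shared by all positions, while alpha and delta carry the position dependence.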

In summary, our results show that constitutive expres- 
sion but not transcription factor activity depends on the 
chromosome position of a target gene. As a consequence, 
both maximum and minimum expressions alter with the 
chromosome position of a target gene but their ratio is 
constant. We additionally demonstrate that gene regulation does not depend on gene orientation. 

Figure 3. The expression of the target gene without flanking terminators at different chromosome positions. Error bars indicate s.e.m. Color of the data symbols throughout the figure indicates positions in panel A. (A) Gene circuit used in these experiments and chromosome positions of yfp, origin and termination of replication, and lacI are shown. Strains used are: yjbI lagging (HL3779), yjiP lagging (HL4268), ykfC lagging (HL4237), galK leading (HL1951), jayE lagging (HL4238), arpB leading (HL3776), intS leading (HL2821), yfjV lagging (HL4236), yhdW leading (HL4235) and glvC lagging (HL3778). (B) Induction curves for yfp measured in triplicate. Lines indicate fits using the equation at the top of Figure 1E. (C) Maximum YFP expression as a function of the shortest distance from the origin. (D) Maximum YFP expression as a function of minimum YFP expression for data shown in panel B. (E) Representative northern blot of yfp mRNA and 16S loading control at different chromosome positions for the strains in panel B. The shortest distance from the origin (min.) is in parentheses. Contrast and brightness were adjusted solely to enhance visualization of the printed figure: no bands were obscured or selectively enhanced. (F) The relative yfp mRNA concentration obtained from independent duplicate samples in separate northern blots as a function of chromosome position. Fit is the same as panel C except the y-intercept value (E_0*) is the extrapolated normalized mRNA concentration at the origin. 



Effect of chromosome position on gene expression depends 
on cell growth rates 

Equation (1) shows that gene expression at different chromosome positions depends on the doubling time (τ). This prediction was tested by measuring maximum gene expression (at 1 mM IPTG) at 10 positions in 3 types of media. The media were LB, M9 + 0.4% glucose and M9 + 0.4% glycerol, which resulted in doubling times (τ) of 20.9 ± 0.4, 61.9 ± 1.1 and 86.6 ± 3.2 min, respectively (Supplementary Figure S1). Expression at the different positions was normalized by the level obtained for the gene closest to the origin (glvC) for each growth condition (Figure 2). These measurements showed that the relative difference in expression with distance from the origin (i.e. the slope of the function) decreased as the doubling time increased. The doubling times were substituted into Equation (1) (C = 47 min and E_0 = 1) and this theoretical function, which had no free parameters, agreed with the data. These results are in agreement with an earlier study (3) and show that relative differences in gene expression due to chromosome position can be predictably tuned by altering the cell growth rate. In addition, the results provide further support for the differences in constitutive expression being due to differences in gene copy number. 

Effect of local sequences on gene regulation 

The above experiments examined the effect of chromosome position on a gene that is isolated from neighboring sequences by an upstream and two downstream terminator sequences. We next examined the expression of the target gene without flanking terminator sequences at 10 chromosome positions. At each position the gene was inserted in the reverse direction to the native orientation of the gene it replaced to minimize the effect of native regulatory mechanisms and to prevent the reporter gene being transcribed within existing operons (Figure 3A). 
half-maximal induction in the absence of IPTG, and h is the Hill coefficient for LacI binding to operator sites at PLlacO-1. (B) Diagram showing the 
for half-maximal induction ([I50]) as a function of the relative LacI expression. Black symbols indicate data and gray lines are the relationships predicted by Equation (4) with K_I = 6.3 × 10^5 and 1.0 × 10^6 M^-1 (upper and lower lines, respectively). 



Without the flanking terminators, transcription can occur 
into and out of the gene causing RNA polymerases to 
collide and prematurely terminate transcription (42), and 
it can result in the inclusion of additional sequences into 
the transcript that alter its degradation and folding. These 
potential effects on gene expression which are not specif- 
ically due to chromosome position were grouped together 
and termed 'local sequence effects'. 

We measured YFP expression at different IPTG concentrations and these induction curves were fitted to a standard Hill-type function (40,41) (Figure 3B, Supplementary Table S5 and Supplementary Figure S4). We found that the maximum and minimum expression levels were the parameters that varied most among the positions, as was observed above when terminators were present. Maximum expression generally decreased with distance from the origin but there were clearly positions (circled in Figure 3C) where expression was lower than predicted by Equation (1); this indicates that local sequence effects at some positions have a strong effect on the constitutive expression from the translocated gene. The ratio of maximum to minimum expression across all the positions was relatively constant at 8.44 ± 0.68 (linear regression: P < 0.0001 and R² = 0.95), which was similar to the value obtained with terminators present (compare Figure 3D and Figure 1F). Therefore we show again that transcription factor activity does not depend on the site of the target gene. In support of this, the Hill coefficient and the IPTG concentration required for half-maximal induction were relatively constant at the different target gene positions. 

To provide a quantitative assessment of the impact of local sequences on constitutive expression we calculated the 'relative displacement', which is the relative distance of the observed value for the maximum expression from the predicted value at each gene position (red line in Figures 1B, 1C and 3C). That is, the relative displacement = (observed maximum expression - predicted maximum expression)/predicted maximum expression. For genes with terminators, there was little relative displacement, with almost all observed values differing by <25% from the predicted maximum expression (Supplementary Figure S5). In contrast, genes without terminators showed much greater relative displacement, with most chromosome positions having observed maximum expression levels that differed by more than 25% from the predicted value (Supplementary Figure S5). This greater displacement indicates the larger impact of local sequences at different positions. 
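
The relative displacement defined above is a one-line calculation. The following Python sketch shows it together with the 25% comparison used in the text; the function names and the example numbers in the usage note are hypothetical and are not measurements from the study.

```python
def relative_displacement(observed_max, predicted_max):
    """(observed - predicted) / predicted, as defined in the text."""
    return (observed_max - predicted_max) / predicted_max


def exceeds_threshold(observed_max, predicted_max, threshold=0.25):
    """True if the observed value differs from the prediction by more than `threshold`."""
    return abs(relative_displacement(observed_max, predicted_max)) > threshold
```

For instance, a hypothetical observed value of 50 a.u. against a predicted 100 a.u. gives a relative displacement of -0.5 and would count as a position with a strong local sequence effect.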

Northern blots were performed and quantified as previously described (33) with a probe located at the 5' end of yfp to determine the concentration and length of the mRNAs in the absence of the terminators (Figure 3E and Supplementary Figure S6). We found a range of different length mRNAs within and between each chromosome position. Some mRNAs were shorter than the expected length of the T7 10::yfp fusion (804 nucleotides), indicating possible premature termination or partial degradation, and many mRNAs were longer, indicating additional upstream or downstream sequences adjoined to the coding sequence. Since there was a range of mRNA lengths we measured the total concentration of T7 10::yfp fusion mRNA that is at least the minimum length for the coding sequence and found that in general it decreased with distance from the origin as expected (Figure 3F). Some of the positions that had lower than expected YFP expression had mRNA concentrations that were higher than predicted (e.g. ykfC, green symbol in Figure 3C and F), indicating decreased translation (possibly due to altered mRNA folding). Other positions had reduced mRNA concentrations (e.g. yfjV, blue symbol in Figure 3C and F), indicating decreased transcription or increased mRNA degradation. These differences suggest that altered mRNA lengths and sequences are altering mRNA lifetimes and translational efficiency via a variety of mechanisms. 

In summary, local sequences can alter gene expression 
by various mechanisms but their effect is limited to when 
terminators are absent and they only alter constitutive 
expression. Local sequences did not substantially influence 
transcription factor binding (i.e. they did not alter 
cooperativity or the IPTG concentration required for 
half-maximal induction). 

Using chromosome position to modulate gene regulation 

We sought to build a model of our repressor-target gene 
circuit that incorporates chromosome position effects and 
then use the model to predict how translocation of the 
transcription factor gene would alter the output of our 
circuit. The Hill-type function used above regards 
maximum and minimum expression as independent vari- 
ables, which was useful for obtaining empirical parameters 
from the induction curves; but we have now shown that 
this is not the case. Therefore an alternate form of the Hill 
function was used (Figure 4A and Supplementary 
Methods), which is similar to that reported by others 
(43,44), to determine the amount of expression E for a 
given IPTG concentration ([IPTG], units: M): 



$$E([\mathrm{IPTG}]) = \frac{\beta}{1 + \dfrac{\varphi}{\left(K_I[\mathrm{IPTG}] + 1\right)^{h}}}, \quad \text{where } \varphi = \left(\frac{[\mathrm{Total\ R}]}{K_D}\right)^{h} \qquad (2)$$

β is the expression efficiency (a.u.), K_I is the equilibrium association constant (M^-1) for IPTG binding to LacI, h is the Hill coefficient (unitless), [Total R] is the total repressor concentration (M) and K_D is the repressor concentration required for half-maximal induction in the absence of IPTG (M). β depends on the mRNA concentration, mRNA translation and fluorescence yield per protein (Supplementary Methods). We showed earlier that φ is independent of yfp position which indicates that h is also independent of yfp position. K_I, which defines the affinity of IPTG for LacI, is also independent of yfp position. It should also be noted that while Equation (2) is able to predict expression with known parameters it is not useful for obtaining parameter values due to large fitting errors (Supplementary Table S4). 

Equation (2) shows that at saturating IPTG concentrations, gene expression is maximal and approximately equals the expression efficiency (β). The expression efficiency is determined by the chromosome position of the target gene (yfp). At [IPTG] = 0, gene expression is at its minimum and equals β/(1 + φ). The ratio of maximum to minimum expression (i.e. dynamic range) equals 1 + φ. That is, the dynamic range is independent of the expression efficiency and therefore independent of chromosome position, as was observed experimentally (Figures 1F and 3D). 
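
The limiting behavior described above can be checked numerically. The sketch below assumes that Equation (2) has the form E = β / (1 + φ / (K_I·[IPTG] + 1)^h), which reproduces the stated limits (β at saturating IPTG and β/(1 + φ) without IPTG). The parameter values are those quoted in the text (β = 936 a.u., φ = 7.44, h = 1.51, K_I = 1.0 × 10^6 M^-1); the function name is hypothetical.

```python
def expression(iptg_molar, beta, phi, h, k_i):
    """Assumed form of Equation (2): expression as a function of [IPTG] in M."""
    return beta / (1.0 + phi / (k_i * iptg_molar + 1.0) ** h)


BETA, PHI, H, K_I = 936.0, 7.44, 1.51, 1.0e6   # values quoted in the text

e_min = expression(0.0, BETA, PHI, H, K_I)      # equals beta / (1 + phi)
e_max = expression(1e-3, BETA, PHI, H, K_I)     # 1 mM IPTG, close to beta

# The dynamic range depends on phi but not on beta (chromosome position):
range_a = expression(1.0, BETA, PHI, H, K_I) / expression(0.0, BETA, PHI, H, K_I)
range_b = expression(1.0, 2000.0, PHI, H, K_I) / expression(0.0, 2000.0, PHI, H, K_I)
```

Changing only `beta` rescales the whole curve, which corresponds to moving the target gene; changing `phi` alters the minimum and hence the dynamic range, which corresponds to changing the repressor concentration.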

Equation (2) also predicts the effect of lacI position on target gene regulation. If altering the chromosome position of lacI changes the total repressor concentration by f [the ratio of LacI expression at the new (β_lacI) and original (β_0) positions], the new minimum YFP expression will be equal to 

$$\frac{\beta}{1 + \varphi f^{h}}, \quad \text{where } f = \frac{\beta_{lacI}}{\beta_0} \qquad (3)$$

Because the position of lacI alters minimum expression but not maximum expression it will also change the dynamic range. Furthermore, the position of lacI will also affect the [IPTG] needed for half-maximal induction ([I50]) by (derivation in Supplementary Methods) 

$$[I_{50}] = \frac{1}{K_I}\left[\left(\varphi f^{h} + 2\right)^{1/h} - 1\right] \qquad (4)$$

In summary, the model explains the observed effects of chromosome position on target gene (yfp) expression (i.e. altered maximum and minimum expression and a constant dynamic range) and it predicts the effect of lacI position on gene regulation (i.e. altered minimum expression, [I50] and dynamic range). 
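
The two predictions for lacI translocation can be cross-checked against each other. The sketch below assumes the forms min = β/(1 + φ·f^h) for Equation (3) and [I50] = ((φ·f^h + 2)^(1/h) - 1)/K_I for Equation (4), together with an induction curve in which φ is scaled by f^h. Under these assumptions the expression at [I50] should lie exactly halfway between the minimum and β. Parameter values are those quoted in the text; the function names and the example values of f are hypothetical.

```python
def expression(iptg_molar, beta, phi, h, k_i, f=1.0):
    """Assumed induction curve with relative LacI expression f."""
    return beta / (1.0 + phi * f ** h / (k_i * iptg_molar + 1.0) ** h)


def min_expression(beta, phi, h, f):
    """Assumed form of Equation (3): expression at [IPTG] = 0."""
    return beta / (1.0 + phi * f ** h)


def i50(phi, h, k_i, f):
    """Assumed form of Equation (4): [IPTG] (M) at half-maximal induction."""
    return ((phi * f ** h + 2.0) ** (1.0 / h) - 1.0) / k_i


BETA, PHI, H, K_I = 936.0, 7.44, 1.51, 1.0e6   # values quoted in the text

# For several hypothetical relative LacI levels, compare the expression at
# [I50] with the midpoint between minimum and maximum expression.
checks = []
for f in (0.5, 1.0, 2.0):
    midpoint = 0.5 * (BETA + min_expression(BETA, PHI, H, f))
    at_i50 = expression(i50(PHI, H, K_I, f), BETA, PHI, H, K_I, f)
    checks.append(abs(at_i50 - midpoint))
```

In this sketch, raising f lowers the minimum expression and raises [I50], while the maximum stays at β, matching the qualitative predictions listed in the summary.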

We tested the model's predictions by placing lacI at
eight chromosome positions without terminator sequences
and measuring its effect on two target genes (yfp at galK
and cfp at intS) (Figure 4B). If the model is predictive for
genes without terminator sequences then the model is
robust. We found that, as predicted, minimum expression
changed with the position of lacI and maximum
expression did not (Figure 4C and D). We then examined
whether minimum expression decreased as a function of
the relative LacI expression according to the relationship
specified by Equation (3). LacI expression at each position
(β_lacI), which was assumed to be proportional to
maximum YFP expression at the same location, was
divided by expression at the native position (β_0) to yield
the relative LacI expression. β_0 was estimated to be
equivalent to that at yfjV (β_0 = 225 ± 28 a.u.) since lacI
at this position produced similar levels of repression
(Figure 4C). Values for φ (7.44 ± 0.68), h (1.51 ± 0.03)
and β (936 ± 121 a.u.) were obtained from the measurements
for yfp without terminators (Materials and
Methods section) and substituted into Equation (3). This
yielded the predicted relationship between minimum YFP
expression and relative LacI expression (with no free
fitting parameters), which agreed with the observed
values (Figure 4E).
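
As a rough illustration of how Equation (3) is used, the following
Python sketch evaluates the predicted minimum YFP expression,
β/(1 + φf), and the corresponding dynamic range for several values of
the relative LacI expression f. The values of φ and β are those quoted
above; the function names and the sample f values are hypothetical, and
measurement uncertainties are ignored.

```python
# Parameter values quoted in the text (uncertainties omitted).
PHI = 7.44    # phi, as fitted for yfp without terminators
BETA = 936.0  # beta, maximum YFP expression (a.u.)


def min_expression(f, phi=PHI, beta=BETA):
    """Predicted minimum expression from Equation (3): beta / (1 + phi * f)."""
    return beta / (1.0 + phi * f)


def dynamic_range(f, phi=PHI):
    """Ratio of maximum (beta) to minimum expression, i.e. 1 + phi * f."""
    return 1.0 + phi * f


if __name__ == "__main__":
    # Hypothetical relative LacI expression levels (f = 1: native position).
    for f in (0.5, 1.0, 2.0, 4.0):
        print(f"f = {f}: min = {min_expression(f):.1f} a.u., "
              f"dynamic range = {dynamic_range(f):.2f}")
```

In this sketch the maximum expression is fixed at β, so any change in f
moves only the minimum and therefore the dynamic range.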



We next examined whether the IPTG concentrations
required for half-maximal induction matched the predictions
of the model. Values for φ, h and a lower and upper
value for the equilibrium association constant for IPTG
binding to LacI [K_I = 6.3 × 10^5 and 1.0 × 10^6 M^-1,
respectively, which are calculated from the reciprocal of the
equilibrium dissociation constant (45–47)] were substituted
into Equation (4). The resulting functions were consistent
with the data at high levels of LacI (Figure 4F). At low
LacI concentrations, the [IPTG] required for half-maximal
induction was higher than expected. A
probable explanation is that the ratio of intracellular to
extracellular IPTG is not constant but varies with the
external [IPTG] due to active transport by membrane
pumps (48,49) and positive feedback regulation (50).
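
The dependence of [I50] on the LacI level can be sketched numerically.
The Python snippet below ASSUMES that Equation (4) has the form
[I50] = (1/K_I)[(φf + 2)^(1/h) − 1], which is an assumed reading of the
equation rather than a confirmed transcription, and it uses the φ, h and
K_I values quoted in the text. The function name and the f values are
hypothetical.

```python
PHI = 7.44      # phi, value quoted in the text
H = 1.51        # Hill coefficient h, value quoted in the text
K_LOW = 6.3e5   # lower value for K_I (M^-1)
K_HIGH = 1.0e6  # upper value for K_I (M^-1)


def i50(f, k_i, phi=PHI, h=H):
    """[IPTG] (in M) needed for half-maximal induction.

    ASSUMPTION: Equation (4) reads
    [I50] = (1 / K_I) * ((phi * f + 2) ** (1 / h) - 1).
    """
    return ((phi * f + 2.0) ** (1.0 / h) - 1.0) / k_i


if __name__ == "__main__":
    # Hypothetical relative LacI expression levels.
    for f in (0.5, 1.0, 2.0, 4.0):
        low = i50(f, K_HIGH) * 1e6   # micromolar, using the upper K_I
        high = i50(f, K_LOW) * 1e6   # micromolar, using the lower K_I
        print(f"f = {f}: [I50] between {low:.2f} and {high:.2f} uM")
```

Under this assumed form, [I50] rises monotonically with f and scales
inversely with K_I, so the two K_I values bracket the prediction.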

The target-TF distance does not affect LacI repression 

It is increasingly recognized that the intracellular environment
is heterogeneous and the sites of gene transcription
and translation are important (51,52). If the diffusion
distance between the repressor gene (which is a site of
repressor production due to the coupling of transcription
and translation) and its target gene is important, we
should expect the relative ratio of repression at two
target genes to vary depending on their relative proximity
to lacI. We calculated the ratio of maximum repression at
two target genes by taking the ratio of their minimum
expression (termed the 'repression ratio') with lacI at
different chromosome positions (Figure 5A). The chromosome
distance does not necessarily represent the physical
distance between lacI and each target gene but if diffusion
is important then the repression ratio should vary
substantially with lacI position. However, we found the repression
ratio was relatively constant for all lacI positions,
indicating that diffusion is not a major contributor to
gene regulation, at least over relatively large distances
(note: the repression ratio ≠ 1 because cfp and yfp do
not have identical transcription and translation efficiencies,
mRNA and protein degradation rates and/or quantum
yields per protein).

We then examined whether there was increased effectiveness
of LacI repression over very short distances, as has
been proposed (23), by placing lacI immediately upstream
of cfp or yfp (Figure 5B). To measure the amount of
repression we used the 'relative CFP/YFP repression' rather
than the repression ratio to correct for any local effects of
the lacI gene on the promoter of the adjacent cfp or yfp.
The relative CFP/YFP repression was calculated from the
ratio of maximum to minimum CFP expression (at 1 and
0 mM IPTG) divided by the ratio of maximum to
minimum YFP expression and the resulting value was
normalized to the CFP/YFP ratio in the absence of LacI
to compensate for a small effect that IPTG has on CFP
and YFP expression independently of LacI. In essence, we
are taking the ratio of the dynamic ranges for CFP and
YFP expression and in the presence of lacI there is a
greater dynamic range for YFP than CFP, which causes
the relative CFP/YFP repression to be <1. If a short
diffusion distance is important then the relative CFP/YFP
repression should be high when lacI is close to cfp and low
when lacI is close to yfp. However, we found no difference
in the relative CFP/YFP repression with lacI at different
positions (Figure 5B). Therefore, the diffusion distance
between a transcription factor gene and its target gene
was not important even at short distances.
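
The calculation of the relative CFP/YFP repression described above can
be sketched as follows. This is a minimal Python illustration, assuming
that the normalization uses the same ratio of dynamic ranges measured in
a strain lacking lacI; the function names and all expression values are
hypothetical.

```python
def fold_change(expr_1mM, expr_0mM):
    """Dynamic range: expression at 1 mM IPTG over expression at 0 mM IPTG."""
    return expr_1mM / expr_0mM


def relative_cfp_yfp_repression(cfp, yfp, cfp_no_laci, yfp_no_laci):
    """Ratio of CFP to YFP dynamic ranges, normalized to the lacI-free strain.

    Each argument is a tuple (expression at 1 mM IPTG, expression at 0 mM
    IPTG). The normalization step is an ASSUMED reading of the text.
    """
    with_laci = fold_change(*cfp) / fold_change(*yfp)
    without_laci = fold_change(*cfp_no_laci) / fold_change(*yfp_no_laci)
    return with_laci / without_laci


if __name__ == "__main__":
    # Hypothetical fluorescence values (a.u.), not data from the study.
    value = relative_cfp_yfp_repression(
        cfp=(1000.0, 200.0), yfp=(900.0, 100.0),
        cfp_no_laci=(1000.0, 980.0), yfp_no_laci=(900.0, 890.0),
    )
    print(f"relative CFP/YFP repression = {value:.2f}")
```

A value below 1 corresponds to a greater dynamic range for YFP than for
CFP, as described for the strains containing lacI.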

Effect of chromosome position on gene expression noise

We investigated the effect of chromosome position on 
gene expression noise in our simple circuit. Gene expression
noise was quantified by the coefficient of variation
(C.V.), which is simply the standard deviation (S.D.) of
the expression divided by the mean with background
autofluorescence subtracted (53). We first analyzed the
data from the induction curves with yfp at different 
chromosome positions where mean expression varies 
with IPTG concentration and chromosome position. We 
calculated the C.V. for all the cells in the sample and for 
only those cells within 2 S.D.s from the mean; the latter 
was to demonstrate the values are not unduly determined 
by a small number of outliers. For each chromosome 
position, increasing the mean expression by increasing 
the IPTG concentration resulted in a decreasing 'C.V. 
trend'. Across chromosome positions, this decreasing 
trend shifted to the left or right but not up or down 
(Figure 6A). That is, the chromosome position of a 
target gene alters mean expression with little effect on 
gene expression noise. 
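
The noise measure used above can be written out explicitly. The Python
sketch below computes the C.V. for all cells and for only those cells
within 2 S.D. of the mean. It assumes the population standard deviation
and a single scalar autofluorescence value, neither of which is
specified in the text; the function names and the example data are
hypothetical.

```python
import statistics


def cv(values, background=0.0):
    """S.D. divided by the mean with background autofluorescence subtracted."""
    mean = statistics.fmean(values) - background
    return statistics.pstdev(values) / mean


def cv_within_2sd(values, background=0.0):
    """C.V. computed only from cells within 2 S.D. of the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    kept = [v for v in values if abs(v - mean) <= 2.0 * sd]
    return cv(kept, background)


if __name__ == "__main__":
    # Hypothetical single-cell fluorescence values with one outlier.
    cells = [100.0, 110.0, 95.0, 105.0, 98.0, 102.0, 107.0, 93.0, 900.0]
    print(f"C.V. (all cells)      = {cv(cells, background=10.0):.3f}")
    print(f"C.V. (within 2 S.D.)  = {cv_within_2sd(cells, background=10.0):.3f}")
```

Comparing the two values indicates how strongly a small number of
outlying cells influences the estimate.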

When lacI was placed at different chromosome positions
we found the C.V. values for a given mean expression
level were the same (Figure 6B). That is, translocation
of lacI does not result in a shift up or down for the C.V.
trend and therefore does not increase or decrease gene
expression noise. The lacI position does affect the total LacI
concentration and therefore minimum expression level;
this establishes the lower bound of the downward 'trend'
for each position and thus the maximum C.V. value
attainable in the absence of IPTG (Figure 6B and
Supplementary Figure S7).

Figure 6. Effect of chromosome position on gene expression noise.
Error bars indicate s.e.m. of each sample. Color of the data symbols
indicates the chromosome positions in Figure 1A. Gene expression
noise as a function of mean YFP expression at different target gene
(yfp) positions (A) and different lacI positions (B). At top we show
the gene circuit for each experiment. Upper plots show all positions
and lower plots display only two positions. Gene expression noise for
cfp is displayed in Supplementary Figure S7.

A simple stochastic model of gene expression was
created to examine the noise sources. In the model,
switching occurs between a 'repressed' state with LacI
bound and an 'unrepressed' state with LacI unbound at
rates determined by k_-1 and k_1 (note: k_-1 is dependent on
the LacI concentration) (Figure 7A). Initially, we assumed
the gene copy number (which depends on chromosome
position) simply alters the number of mRNAs produced
per unit time and thus the magnitude of the rate constant
for transcription in the unrepressed state (k_M). The model
also includes mRNA translation and degradation of
mRNAs and proteins which do not vary. It has been
reported that when transcription factor binding and
unbinding occur at rates that are low compared with the
transcription rate, 'bursts' of transcription occur and
this becomes the dominant source of noise (16,17,54,55).
Under these conditions (i.e. k_1 = k_-1 = 0.01 k_M; note: all
have units of min^-1 because the LacI concentration is
included within k_-1), altering mean expression by
varying LacI binding or LacI concentration has a large
effect on the C.V. compared with varying mean expression
by varying the transcription rate k_M (Figure 7B).
Experimentally this was observed as an increase in
maximal C.V. that accompanies lower minimum expression
levels due to higher LacI concentrations. These simulations
show that altering mean expression by changing
the chromosome position and consequently the mRNA
concentration has minimal effect on gene expression
noise (compare Figure 7C and bottom panel Figure 6A)
which is in agreement with our observations.
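
A model of this kind can be explored with a standard Gillespie
simulation. The Python sketch below is not the implementation used for
Figure 7; it is a minimal illustration of a two-state promoter with
transcription, translation and degradation. The default rates for mRNA
degradation, translation and protein degradation (d_m, k_p, d_p) are
hypothetical, as are the function and argument names.

```python
import math
import random


def simulate(k1, k_minus1, k_m, t_end, d_m=0.2, k_p=1.0, d_p=0.02, seed=0):
    """Gillespie simulation of a two-state promoter (all rates in min^-1).

    k1: LacI unbinding (repressed -> unrepressed); k_minus1: LacI binding
    (unrepressed -> repressed, LacI concentration folded in); k_m:
    transcription rate in the unrepressed state. d_m, k_p and d_p are
    HYPOTHETICAL values. Returns the time-averaged (mean, C.V.) of the
    protein copy number.
    """
    rng = random.Random(seed)
    t, unrepressed, mrna, protein = 0.0, True, 0, 0
    sum_p = sum_p2 = 0.0
    while True:
        rates = (
            k_minus1 if unrepressed else k1,  # promoter switching
            k_m if unrepressed else 0.0,      # transcription
            d_m * mrna,                       # mRNA degradation
            k_p * mrna,                       # translation
            d_p * protein,                    # protein degradation
        )
        total = sum(rates)
        wait = rng.expovariate(total) if total > 0.0 else math.inf
        remaining = t_end - t
        if wait >= remaining:
            # No further event before t_end: account for the last interval.
            sum_p += protein * remaining
            sum_p2 += protein * protein * remaining
            break
        sum_p += protein * wait
        sum_p2 += protein * protein * wait
        t += wait
        pick = rng.random() * total
        if pick < rates[0]:
            unrepressed = not unrepressed
        elif pick < rates[0] + rates[1]:
            mrna += 1
        elif pick < rates[0] + rates[1] + rates[2]:
            mrna -= 1
        elif pick < rates[0] + rates[1] + rates[2] + rates[3]:
            protein += 1
        else:
            protein -= 1
    mean = sum_p / t_end
    var = max(sum_p2 / t_end - mean * mean, 0.0)
    cv = math.sqrt(var) / mean if mean > 0.0 else float("nan")
    return mean, cv


if __name__ == "__main__":
    k_m = 0.1
    # Lower the mean either by faster LacI binding or by a lower k_M.
    print("baseline      ", simulate(0.01 * k_m, 0.01 * k_m, k_m, 2e5))
    print("more binding  ", simulate(0.01 * k_m, 0.04 * k_m, k_m, 2e5))
    print("lower k_M     ", simulate(0.01 * k_m, 0.01 * k_m, 0.4 * k_m, 2e5))
```

Comparing the second and third runs against the baseline gives a
qualitative sense of how the C.V. responds when the mean is changed
through LacI binding rather than through the transcription rate.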

Figure 7. Chromosome position has minimal impact on gene expression
noise. (A) Model of gene expression as described in main text.
(B) Simulated gene expression noise as a function of mean expression
at varying values for k_1, k_-1 and k_M. Initially k_1 = k_-1 = 0.001
concentration·time^-1 and k_M = 0.1 concentration·time^-1. Each value
was independently varied by 0.25-, 0.5-, 2- and 4-fold. (C) Varying
the rate of LacI binding at two [...] expression for HL1852 and HL2028.
R-values are calculated from all cells or for only cells within 2 S.D.
of the mean (values in parentheses).

We now compare two scenarios for transcription among
multiple copies of the target gene (Scenarios 1 and 2,
Figure 7D). To be clear, we refer to copies of the same
gene at the 'same' position which arise during DNA
replication. In Scenario 1, switching between the repressed
and unrepressed states and transcription events are
highly correlated because fluctuations in the LacI
concentration are the dominant source of noise. This results in all
the gene copies behaving the same, which is essentially the
scenario in the above simulation where we assumed there
was one copy of the gene and varied k_M. In Scenario 2,
switching and transcription events at each copy of the
gene are independent and most of the noise occurs due
to random switching events at individual promoters.
Under this scenario, increasing the copy number of a
gene by moving it closer to the origin leads to a more
stable average for the occupancy of the repressed and
unrepressed states. That is, increasing mRNA production
by increasing the number of independent copies of a gene
(thereby indirectly increasing k_M) results in a lower C.V.
than simply increasing the k_M for a single copy gene
(Figure 7E). However, our experimental data provided no
evidence to support Scenario 2 (Figure 6A). That is,
genes that are closer to the origin did not have a lower
C.V. for a given mean, indicating that transcription from
each gene copy is highly correlated; we now show that this
is most likely due to stochastic fluctuations in the LacI
concentration.
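
The contrast between the two scenarios can be illustrated with a
standard statistical identity rather than the stochastic simulation
itself. For n identical gene copies whose outputs have pairwise
correlation ρ, the C.V. of the total output is the single-copy C.V.
multiplied by sqrt((1 + (n − 1)ρ)/n). The Python sketch below evaluates
this; the function name and the example values are hypothetical.

```python
import math


def cv_of_total(cv_single, n_copies, rho):
    """C.V. of the summed output of n identical gene copies.

    rho is the pairwise correlation between copies: rho = 1 corresponds
    to Scenario 1 (fully correlated), rho = 0 to Scenario 2 (independent).
    """
    return cv_single * math.sqrt((1.0 + (n_copies - 1) * rho) / n_copies)


if __name__ == "__main__":
    cv_single = 0.5  # hypothetical single-copy C.V.
    for n in (1, 2, 4):
        print(f"n = {n}: correlated = {cv_of_total(cv_single, n, 1.0):.3f}, "
              f"independent = {cv_of_total(cv_single, n, 0.0):.3f}")
```

With fully correlated copies the C.V. is unchanged by copy number,
whereas with independent copies it falls as 1/sqrt(n), which is the
signature that was not observed experimentally.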

If fluctuations in LacI are the dominant source of noise
as the above suggests, then the expression of two genes under
LacI control should increase and decrease together, resulting
in a positive correlation coefficient (R ≫ 0). It is not
possible to measure the expression noise of different
copies of the same gene at the same position. However,
the above also applies to genes under LacI control at
different positions. Therefore, we measured the correlation
coefficient for cfp and yfp under the control of LacI at two
positions that are approximately equidistant from oriC
(intS and galK). We found that at most levels of expression
R was >0.5, which is consistent with a source of noise
causing CFP and YFP expression to increase and decrease
together (i.e. highly correlated expression) (green symbols,
Figure 7F and G). To demonstrate that LacI is responsible
for cfp and yfp expression being correlated we deleted lacI
and showed that this causes the correlation coefficient to
decrease from 0.78 to 0.25 (black symbols, Figure 7F and
H). That is, transcription becomes independent and less
correlated without LacI.
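
The reasoning behind this test can be illustrated with a toy model in
which each cell has a shared fluctuation (standing in for LacI) plus
independent, gene-specific fluctuations. The Python sketch below is an
additive Gaussian caricature, not the analysis applied to the flow
cytometry data; the function names and all parameter values are
hypothetical.

```python
import math
import random


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_pair(n_cells, shared_sd, private_sd, seed=0):
    """Toy CFP and YFP deviations per cell: shared term + private noise."""
    rng = random.Random(seed)
    cfp, yfp = [], []
    for _ in range(n_cells):
        shared = rng.gauss(0.0, shared_sd)  # fluctuation common to both genes
        cfp.append(shared + rng.gauss(0.0, private_sd))
        yfp.append(shared + rng.gauss(0.0, private_sd))
    return cfp, yfp


if __name__ == "__main__":
    with_shared = pearson_r(*simulate_pair(5000, shared_sd=2.0, private_sd=1.0))
    without_shared = pearson_r(*simulate_pair(5000, shared_sd=0.0, private_sd=1.0))
    print(f"R with shared fluctuation    = {with_shared:.2f}")
    print(f"R without shared fluctuation = {without_shared:.2f}")
```

In this caricature R approaches shared_sd²/(shared_sd² + private_sd²),
so removing the shared term drives the correlation towards zero.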

Together our experimental findings and stochastic simu- 
lations show that chromosome position, which alters 
mean expression via its effect on gene copy number and 
thus the mRNA concentration, has minimal impact on 
expression noise compared with fluctuations in the LacI 
concentration. That is, changing gene position, and there- 
fore the gene copy number and local sequence effects, has 
minimal impact on gene expression noise. Increasing the 
mRNA concentrations by increasing promoter strength 
without altering transcription factor binding would 
be expected to have a similar effect to increasing gene 
copy number and this has been shown to be the case in 
yeast (56). 



DISCUSSION 

This study examined the effect on gene regulation of three 
features of genome organization that may alter with gene 
translocation (chromosome position, gene orientation and 
the relative distance between interacting genes). 
Remarkably, only chromosome position had a demon- 
strable effect and it primarily determined the maximum 
and minimum expression levels of the target gene. More 
specifically, maximum and minimum expression levels 
decreased with the distance of the target gene from the 
origin and we showed that this was associated with a 
decrease in gene copy number. These findings are consist- 
ent with Cooper and Helmstetter's model (6). 
Surprisingly, transcription factor activity (as measured
by the amount of repression, the Hill coefficient and the
IPTG concentration required for half-maximal induction)
was constant, which contrasts with results reported in
eukaryotes. This likely reflects the prominent role that
long-distance regulatory mechanisms such as enhancers and
silencers, and chromatin modification, have in eukaryotes
[reviewed in (57)] and which are absent in E. coli.
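
For orientation, the dependence of gene copy number on chromosome
position can be sketched with the commonly quoted form of the
Cooper-Helmstetter relationship, in which the average number of copies
per cell of a gene at relative position m (0 at oriC, 1 at the terminus)
is 2^([C(1 − m) + D]/τ). The Python sketch below uses this textbook form
as an assumption; it is not the fitting procedure of this study, and the
C, D and τ values and the function name are hypothetical.

```python
def gene_copy_number(m, c_min=40.0, d_min=20.0, tau_min=25.0):
    """Average copies per cell of a gene at relative position m.

    m: 0 at oriC, 1 at the replication terminus. c_min (replication time),
    d_min (time from termination to division) and tau_min (doubling time)
    are HYPOTHETICAL values in minutes.
    """
    return 2.0 ** ((c_min * (1.0 - m) + d_min) / tau_min)


if __name__ == "__main__":
    for m in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"m = {m}: about {gene_copy_number(m):.2f} copies per cell")
```

In this form the ratio of origin-proximal to terminus-proximal copy
number is 2^(C/τ), so the position effect grows with faster growth.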

Our study showed that local sequences at different 
chromosome positions can modulate gene regulation so 
that gene expression deviates from that predicted to 
occur solely due to changes in gene copy number. Local 
sequences can alter expression via alterations in mRNA 
length, concentration and translation. However, local 
sequence effects appear to be limited to modulating con- 
stitutive gene expression levels and are relatively small 
unless the translocated gene lacks flanking terminators. 
In studies that have translocated genes for LacZ (5) and
histidinol dehydrogenase (3), the difference between the
observed values and the values predicted by the
Cooper-Helmstetter model also appears to be relatively small
(i.e. a small relative displacement), indicating that local
sequences also have limited effect on the constitutive
expression of non-fluorescent genes.

We did not specifically assess if there were any periodic 
changes in expression with distance from the origin that 
would be compatible with regular DNA compaction and 
coiling along the chromosome. However, the expression 
data clearly indicate that any periodicity that exists must 
have a relatively small effect on gene expression because 
the pattern observed deviated very little from that pre- 
dicted solely by the gene copy number. That is, if DNA 
looping and coiling had a large effect on expression then 
the measured values would be expected to display large 
deviations from the predicted fits depending on whether 
the gene was situated in a position that is affected or not; 
and this was not observed in our study (Figure 1B and C)
or in a previous study that placed LacZ at different 
chromosome positions (5). 

Gene orientation had no demonstrable effect on gene 
expression despite our measurements being performed at 
the fastest growth rates in LB media when collisions 
between RNA polymerases and the replisomes should be 
most common. This indicates that collisions do not affect 
genes on the leading and lagging strands differently, in 
agreement with comparative genome analyses that 
suggest that gene orientation is associated with the func- 
tional class of a gene rather than its expression level (19). 

The distance between the target gene and the transcrip- 
tion factor gene did not affect the amount of transcription 
factor (LacI) activity at the target gene. Although LacI is a
common paradigm for studying diffusion of transcription
factors, its relatively high association rate constant
compared with many other proteins may mean that its
facilitated diffusion rate and the time taken to bind its
targets are not representative (58). Furthermore, we
could not examine whether the target-TF distance 
provides an advantage in the dynamics when a transcrip- 
tion factor gene is first turned on or when the transcription 
factor's half-life is very short. With these limitations, the 
possibility that the target-TF distance may be functionally 
important for some proteins and under some conditions 
cannot be excluded. We stress that while we find no 
evidence that the distance between a target gene and its 
transcription factor gene affects transcription factor 
activity, we are not suggesting that the spatial 
organization within the cell is not important for gene
regulation; in fact, there is clear evidence that it is (59).

We found that most of the target gene's expression
noise arises from fluctuations in the transcription factor
(LacI) concentration. However, LacI fluctuations did not
scale with the number of LacI molecules produced (i.e. the
relationship between gene expression noise and mean
expression did not alter with lacI's position). Consequently,
the primary source of target gene expression noise is not
the intrinsic fluctuations in LacI production but other
extrinsic factors such as the transcription factors that
regulate lacI. Therefore, there appears to be a consistent
pattern with the dominant source of expression noise for
both lacI and yfp being due to extrinsic factors that
regulate their expression. In comparison, chromosome
position has minimal impact on gene expression noise.
This finding differs from reports in yeast and mammalian
cells and is most likely due to bacteria lacking enhancers,
heterochromatin and other factors (16,60,61) that are
associated with infrequent, stochastic switching between
the active and silent expression states (16,17).

This study demonstrates that gene translocation is a 
potential mechanism for reprogramming the output of 
synthetic gene networks. Moreover the Cooper- 
Helmstetter and/or Hill-type functions can be used to re- 
program gene regulation and networks in a predictable 
manner. Alternative methods for controlling expression 
levels through transcription factor regulation and the 
introduction of mutations at the promoter (62) and RBS 
(63) can produce a greater fold change in expression than 
was observed but they also have disadvantages. Tuning 
gene expression by altering transcription factor binding 
can alter gene expression noise as we have shown. 
Introducing mutations to vary the strength of the 
promoter or RBS may disrupt sites needed for transcrip- 
tion factor and sRNA regulation. These problems are po- 
tentially avoided by selecting different chromosome 
positions to vary expression because the integrity of the 
gene is maintained. Our observation that LacI expression
can be empirically predicted from YFP expression at the 
same position (despite different promoters, ribosome 
binding sequences and coding sequences, and the 
absence of flanking terminators) indicates that chromo- 
some position effects are largely sequence independent, 
and therefore broadly applicable. This is supported by 
earlier studies that have shown chromosome position 
effects with LacZ (5) and histidinol dehydrogenase (3) in 
E. coli and Salmonella typhimurium. 

Determining the contribution of spatial information to 
signaling in gene networks is challenging but it is essential 
to understanding the evolution of genome organization 
and for choosing the optimal position of genes in 
engineered genomes and circuits. Comparative genome 
sequence analyses have identified multiple features of 
genome organization that are common and conserved. 
Several functional and non-functional (e.g. genetic 
linkage) hypotheses have been proposed for why these 
organizational features arise. Synthetic gene circuits are 
ideal for directly testing these hypotheses of function 
and causality. Furthermore synthetic gene circuits allow 
specific features of genome organization to be isolated and
systematically manipulated, thereby providing detailed
information about the reaction steps and dose-response
relationships that are necessary for mathematical modeling
and predictive analyses. This level of detail and direct 
testing of function cannot generally be readily extracted 
from the analyses of microarrays and genome sequences. 
These advantages make synthetic genetic circuits a 
powerful adjunct to high-throughput and bioinformatics 
studies as we have shown here. Other common features of 
genome organization (e.g. gene colocalization) and gene 
arrangements (e.g. operons and overlapping genes) that 
are believed to be functionally important should also be 
characterized using synthetic gene circuits. 

In conclusion, our study assessed how common features 
of genome organization may alter the regulation of 
translocated genes and consequently the output of gene 
networks. We identified features of genome organization 
that are likely to have an impact on gene regulation (i.e. 
chromosome position) and those that are generally not 
likely to have an impact (i.e. gene orientation and 
target-TF distance). Our data not only provide an under- 
standing of the potential regulatory consequences of gene 
translocation but also yield insight into fundamental 
aspects of gene transcription, translation and intracellular 
signaling. The insights include: (i) the impact of gene copy 
number on mean expression as well as gene expression 
noise; (ii) the mechanisms by which local sequences alter 
gene expression; (iii) the minimal effect of diffusion dis- 
tances on transcription factor activity; and (iv) the dem- 
onstration that collisions between the transcription 
machinery and replisomes do not modulate gene expres- 
sion. These findings provide important constraints and 
bounds on the relative rates and magnitudes of different 
processes in gene regulation, which may be incorporated 
into general models of gene regulation. These results in 
E. coli are likely to be applicable to other closely related 
bacteria such as Salmonella (and perhaps more broadly) 
due to similarities in their genome organization and mech- 
anisms of DNA replication and gene expression. 